Important: This page is generated from a Jupyter notebook. Some of the code is hidden and can be shown by clicking the Show Code button. To see the complete notebook, click the View On Github button above.

Introduction - (Motivation part)

Undoubtedly, the recent appearance and spread of the COVID-19 virus has affected the lives of billions of people worldwide in many respects. Governments have been under constant pressure to reduce social interaction in order to limit virus transmission. They have therefore introduced strict measures to face this severe situation, which have had a significant impact on everybody's life.

The economy was one of the major areas affected by those measures. Work culture had to change to meet government directives, which led companies to move faster towards digitalisation. As a result, companies that were not prepared for such changes faced serious financial problems, forcing them in many cases to reduce their workforce. For other companies, such as travel agencies or companies in the hospitality sector, the hit was even harder, since their profits rely entirely on people's need for entertainment, social exploration, etc. Many of them have therefore completely or partially shut down their operations, leaving many people unemployed.

These are common observations and may look discouraging and demotivating to many people. However, we cannot conclude how big this impact is on each country's overall economy without a more in-depth investigation of the actual facts.

We therefore decided to analyse data from a macroeconomic point of view in order to get a clearer understanding of how the virus has affected the economy. We start the study with a statistical analysis of the COVID-19 situation in countries around the globe and then narrow the analysis to the most impacted ones (the map above illustrates the confirmed cases worldwide). We then bring in financial data to explore whether the virus has had a significant impact on the economies of those countries.

To carry out the analysis we use COVID-19 data from a GitHub gist that is updated every day, which gives us a daily overview of the spread of the virus. The financial data come from the IMF, the OECD and other sources, which are listed at the end of the page. We chose these datasets because we believe they contain all the information needed to assess the financial situation of the countries under consideration.

To sum up, this study aims to draw conclusions about the economic consequences of COVID-19. Through interactive and annotated graphs we want to give the intended audience all the information needed to understand the impact of COVID-19 on the economy in a simple and concise manner.

Data analysis and visualization

We start our analysis by illustrating the current COVID-19 situation on a world map. The map is colored according to confirmed cases, and a summary of the other figures (e.g. Deaths, Recovered) can be seen by hovering over the country of interest.

We then introduce the COVID-19 data, and later in the study we go deeper into the economic data.

# collapse-hide
# data preparation: join the reference dataset (refrence) with the virus dataset to obtain the area code for the map plot
refrence = refrence.rename(columns={'Country_Region': 'Country/Region'})
most_recent_data = world_data[world_data['Date'] == world_data['Date'].max()]
most_recent_data = most_recent_data[['Date', 'Country/Region', 'Confirmed','Recovered','Deaths']]
grouped = most_recent_data.groupby('Country/Region').sum()

result = grouped.join(refrence.set_index('Combined_Key'), on='Country/Region')
result = result.fillna(value=0)
result['code3'] = result['code3'].astype(int)

# confirm map
confirmMap = alt.Chart(alt.topo_feature(data.world_110m.url, 'countries'), title='COVID-19 Confirm Overview').mark_geoshape(
    stroke='#aaa', strokeWidth=0.25
).transform_lookup(
    lookup='id', from_=alt.LookupData(data=result, key='code3', fields=['Country/Region','Confirmed','Deaths','Recovered'])
).encode(
    alt.Color('Confirmed:Q',
              scale=alt.Scale(domain=[0, result.Confirmed.max()/10], clamp=True), 
              legend=alt.Legend(format='')),
    tooltip = [('Country/Region:O'),('Confirmed:Q'),('Deaths:Q'),('Recovered:Q')]
).project(
    type='equirectangular'
).properties(
    width=900,
    height=500
).configure_view(
    stroke=None
)

confirmMap

COVID-19 analysis

In this section we dive deeper into the COVID-19 data to present the current situation of the virus through the numbers of confirmed, recovered and death cases. Then, with the help of interactive representations of those numbers, we try to understand the spread rate and distribution of COVID-19.

The following table shows a sample of the COVID-19 data. The dataset contains columns with the date, the country, the confirmed and recovered cases, and the overall deaths per country.

       Date        Country             Confirmed  Recovered  Deaths
20939  2020-05-12  West Bank and Gaza        375        308       2
20940  2020-05-12  Western Sahara              6          6       0
20941  2020-05-12  Yemen                      65          1      10
20942  2020-05-12  Zambia                    441        117       7
20943  2020-05-12  Zimbabwe                   36          9       4

Exploration analysis

In this section we perform a basic statistical analysis of the data in order to identify how the data are distributed among the columns and to detect any important patterns that might be useful later in the analysis.

We start with the descriptive statistics of our dataset. In this way we can summarize the central tendency, dispersion and shape of the dataset's distribution.

The table below shows great differences in the max values among the cases. The standard deviation is high in all cases, which means that the data are widely spread. The means differ considerably as well: the average number of deaths across countries is significantly lower than the average number of confirmed and recovered cases. This implies an average death rate of 6.65% and an average recovery rate of 28.35%. However, some countries have been impacted more than others, so these rates are not equally distributed among them.

          Confirmed      Recovered        Deaths
count  2.094400e+04   20944.000000  20944.000000
mean   5.695126e+03    1653.403934    378.271104
std    4.423989e+04   10614.502146   3062.468010
min    0.000000e+00       0.000000      0.000000
25%    0.000000e+00       0.000000      0.000000
50%    1.000000e+01       0.000000      0.000000
75%    4.312500e+02      48.000000      7.000000
max    1.369376e+06  232733.000000  82356.000000
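
As a rough illustration of how average rates like the ones quoted above can be obtained, the sketch below divides the mean deaths and mean recovered cases by the mean confirmed cases. The small data frame is made-up illustrative data, not the notebook's dataset, and the helper name `average_rates` is hypothetical.

```python
import pandas as pd

def average_rates(df):
    """Return average death and recovery rates (in %) based on column means."""
    means = df[['Confirmed', 'Recovered', 'Deaths']].mean()
    return {
        'death_rate': means['Deaths'] / means['Confirmed'] * 100,
        'recovery_rate': means['Recovered'] / means['Confirmed'] * 100,
    }

# Illustrative numbers only (assumption), not taken from the real dataset
toy = pd.DataFrame({
    'Confirmed': [100, 300, 600],
    'Recovered': [20, 90, 190],
    'Deaths': [5, 15, 40],
})
rates = average_rates(toy)
print(rates)
```

Because the real dataset is refreshed daily, rates recomputed this way may differ slightly from the figures quoted in the text.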

The figure below illustrates the confirmed cases per country; 187 countries are shown in total. We have set a threshold of 100000 confirmed cases (red line in the graph). Countries with more than 100000 cases are shown as red bars, countries with between 10000 and 100000 confirmed cases as orange bars, and the rest, with fewer than 10000 cases, as blue bars.

It can clearly be observed that the threshold of 100000 cases has been exceeded by Brazil, Italy, Spain, Iran, the UK, the USA, France, Germany, Russia and Turkey. Four of those countries (Spain, the USA, Italy, the UK) have crossed 200000 cases, while the USA has reached the extreme figure of 1300000 cases. These ten countries account for 72.37% of total confirmed cases worldwide (as of the day the report was written). A further 22.87% of the cases are in countries with between 10000 and 100000 confirmed cases, while the remaining 4.76% are recorded in the rest of the countries.

Another interesting observation is that Italy, the USA, Germany, France and the United Kingdom (countries that have been hit hard by COVID-19) are among the seven largest IMF advanced economies in the world. This means that a potential impact of the virus on their economies could directly affect the global economy.
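
Shares of this kind can be computed by binning countries on their confirmed cases and summing each bin. The following is a minimal sketch under stated assumptions: the helper `share_by_tier`, its tier labels and the toy numbers are hypothetical and only mirror the 10000 and 100000 boundaries used in this study.

```python
import pandas as pd

def share_by_tier(latest,
                  bins=(-1, 10_000, 100_000, float('inf')),
                  labels=('below 10k', '10k-100k', 'above 100k')):
    """Percentage of all confirmed cases contributed by each tier of countries."""
    # pd.cut uses right-closed intervals: (-1, 10000], (10000, 100000], (100000, inf]
    tier = pd.cut(latest['Confirmed'], bins=list(bins), labels=list(labels))
    totals = latest.groupby(tier, observed=False)['Confirmed'].sum()
    return totals / latest['Confirmed'].sum() * 100

# Illustrative latest-day snapshot (assumption), one row per country
toy = pd.DataFrame({
    'Country': ['A', 'B', 'C', 'D'],
    'Confirmed': [5_000, 45_000, 150_000, 300_000],
})
shares = share_by_tier(toy)
print(shares.round(2))
```

The same helper could be pointed at the Deaths or Recovered columns by changing the column name, which is how comparable shares for those figures would be obtained.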

#collapse-hide
# multiple color support: key is the plot color, value is the inclusive range of confirmed cases
colorDict = {
    'blue': (0, 10000),
    'orange': (10001, 100000),
    'red': (100001, 100000000)
}

def addColorType(df, colorDict):
    # assign a default color, then overwrite it for every row that falls in a range
    df['Color'] = 'blue'
    for key, val in colorDict.items():
        # between() is inclusive on both ends, so boundary values such as 0 or 10001 are not skipped
        df.loc[df['Confirmed'].between(val[0], val[1]), 'Color'] = key

Threshold = pd.DataFrame({'Threshold':[100000]})
# continuous coloring
domain = [10000, 100000, 100000000]
range_ = ['blue', 'red', 'green']
#summary of all the countries
# get the last day's data; confirmed cases are cumulative, so the last day includes all confirmed cases so far
# .copy() avoids modifying a view of full_clean_data when the Color column is added
plotData = full_clean_data.loc[full_clean_data.Date == full_clean_data.Date.max()].copy()
addColorType(plotData, colorDict)
summary = alt.Chart(plotData).mark_bar().encode(
    x=alt.X('Country:O',sort='-y'),
    y=alt.Y("Confirmed:Q"),
    tooltip = [alt.Tooltip('Country'),
               alt.Tooltip('Confirmed')],
    # scale=None makes Altair use the color names stored in the 'Color' column directly
    color=alt.Color('Color:N', scale=None, legend=None)
).properties(width=3000,height=400)

rule = alt.Chart(Threshold).mark_rule(color='red').encode(
    y=alt.Y('Threshold:Q'),tooltip = [alt.Tooltip('Threshold')]
               
)

(summary+rule)

The figures below illustrate the maximum values of each case type for the countries that have exceeded the threshold of 100000 confirmed cases, in order to identify which countries have recorded the highest numbers of confirmed, recovered and death cases due to COVID-19. At first sight we can observe that the case types are not proportional to each other. For example, the countries with the most confirmed cases do not necessarily record the most deaths or recovered cases.

We also observe significant differences in how the cases are distributed among the countries. For instance, there is no consistency in how the cases increased in each country. This makes sense, as the numbers of deaths and recovered cases depend heavily on factors such as each country's health care system and the age of its population, which are beyond the scope of this study.

# collapse-hide
# select the columns with a list; tuple-style column selection on a groupby is no longer supported in recent pandas
group = full_clean_data.groupby('Country')[['Deaths','Confirmed','Recovered']].max().sort_values(by=['Deaths','Confirmed','Recovered'])
group = pd.DataFrame(group)
group = group.reset_index()
# keep only the countries with at least 100000 confirmed cases
new_group = group.query("Confirmed >= 100000")
countries = list(new_group.Country.unique())

#define colors
red = alt.value('#f54242')
green = alt.value('#137E2A')
black = alt.value('#050404')

#presenting the confirmed cases per country
bars = alt.Chart(new_group).mark_bar(size=5).encode(
    x='Confirmed:Q',
    y=alt.Y("Country:O", sort='-x'),color = red
)

text = bars.mark_text(
    align='left',
    baseline='middle',
    dx=3  # Nudges text to right so it doesn't appear on top of the bar
).encode(
    text='Confirmed:Q',color =black
)

bars2 = alt.Chart(new_group).mark_bar(size=5).encode(
    x='Recovered:Q',
    y=alt.Y("Country:O", sort='-x'),color=green
)

text2 = bars2.mark_text(
    align='left',
    baseline='middle',
    dx=3  # Nudges text to right so it doesn't appear on top of the bar
).encode(
    text='Recovered:Q',color=black
)

bars3 = alt.Chart(new_group).mark_bar(size=5).encode(
    x='Deaths:Q',
    y=alt.Y("Country:O", sort='-x'),color=black
)

text3 = bars3.mark_text(
    align='left',
    baseline='middle',
    dx=3  # Nudges text to right so it doesn't appear on top of the bar
).encode(
    text='Deaths:Q',color=black
)

laydermap = (bars + text).properties(width= 250,height=300)|(bars2+text2).properties(width= 250,height=300)|(bars3+text3).properties(width=250,height=300)
laydermap.configure_axis(grid=False).configure_view(strokeWidth=0)

A calculation of the overall deaths and recovered cases shows that the 10 countries under consideration account for 79.95% of the deaths and 66.51% of the recovered cases. Countries that record between 10 and 100 thousand cases account for 17.78% of the deaths and 26.98% of the recovered cases. The remaining 2.26% of deaths and 6.51% of recovered cases are in countries with fewer than 10 thousand cases.

It can be concluded that the group of countries with more total cases also has more total deaths, something that does not hold when we look at countries individually, as we saw above. Even though many countries have been hit by COVID-19, only 10 have been affected significantly, while the rest have suffered a moderate to low impact in terms of deaths.

The 10 countries explain the 66.19025115087702 % of the overall recovered cases
The 10 countries explain the 79.77010076929926 % of the overall death cases

Data analysis of the major countries

Following the findings from the preliminary exploration analysis we focus on the 10 countries that have been affected the most by COVID-19 virus.

To extract as much information as possible from the dataset, several datasets need to be combined. In doing so, we add columns for daily new cases, new deaths and new recovered cases. These new columns show how much the corresponding figures changed compared to the day before.

In addition, missing values must be investigated and treated to bring the dataset into a form ready for analysis. In the present study the missing values were filled with zeros. We considered this the best treatment because filling them with, for example, the mean, mode or median could lead to a false interpretation of the results.

The following tables show first a sample of the final COVID-19 dataset after preprocessing, and then the descriptive statistics of the dataset.

# collapse-show
# data processing to create Active, New cases, New deaths, New recovered
full_clean_data['Active'] = full_clean_data['Confirmed'] - full_clean_data['Recovered'] - full_clean_data['Deaths']

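# .copy() so the new columns below are written to an independent frame, not a view of full_clean_data
selected_data = full_clean_data[full_clean_data['Country'].isin(countries)].copy()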

for i in selected_data.index:
    date = selected_data.loc[i, 'Date']
    country = selected_data.loc[i, 'Country']
    date = datetime.strptime(date, '%Y-%m-%d')
    yesterday = datetime.strftime(date - timedelta(1), '%Y-%m-%d')
    yesterdayData = selected_data.loc[(selected_data.Date == yesterday) & (selected_data.Country == country)]
    if len(yesterdayData) <= 0:
        selected_data.loc[i, 'New cases'] = 0
        selected_data.loc[i, 'New deaths'] = 0
        selected_data.loc[i, 'New recovered'] = 0
        continue
    yesterdayData = yesterdayData.iloc[0]
    selected_data.loc[i, 'New cases'] = selected_data.loc[i, 'Confirmed'] - yesterdayData.Confirmed
    selected_data.loc[i, 'New deaths'] = selected_data.loc[i, 'Deaths'] - yesterdayData.Deaths
    selected_data.loc[i, 'New recovered'] = selected_data.loc[i, 'Recovered'] - yesterdayData.Recovered

selected_data = selected_data.fillna(value=0)
selected_data['New cases'] = selected_data['New cases'].astype(int)
selected_data['New deaths'] = selected_data['New deaths'].astype(int)
selected_data['New recovered'] = selected_data['New recovered'].astype(int)

       Date        Country         Confirmed  Recovered  Deaths  Active  New cases  New deaths  New recovered
20842  2020-05-12  Italy              221216     109039   30911   81266       1402         172           2452
20896  2020-05-12  Russia             232243      43512    2116  186615      10899         107           3711
20914  2020-05-12  Spain              228030     138980   26920   62130        594         176           1841
20929  2020-05-12  Turkey             141475      98889    3894   38692       1704          53           3109
20934  2020-05-12  United Kingdom     227741       1023   32769  193949       3409         628              8
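
The day-over-day columns can also be derived without a row-by-row loop. The sketch below is one possible vectorized variant, not the notebook's own code: the helper name `add_daily_changes` and the toy data are hypothetical, and it assumes one row per country per day with no missing days, so that the previous row is always the previous day.

```python
import pandas as pd

def add_daily_changes(df):
    """Add 'New cases', 'New deaths' and 'New recovered' as per-country daily differences."""
    # ISO 'YYYY-MM-DD' strings sort chronologically, so no date parsing is needed here
    out = df.sort_values(['Country', 'Date']).copy()
    pairs = {'Confirmed': 'New cases', 'Deaths': 'New deaths', 'Recovered': 'New recovered'}
    for total_col, new_col in pairs.items():
        # diff() within each country; the first day has no previous row, so it becomes 0
        out[new_col] = out.groupby('Country')[total_col].diff().fillna(0).astype(int)
    return out

# Illustrative cumulative counts (assumption), not the real dataset
toy = pd.DataFrame({
    'Date': ['2020-03-01', '2020-03-02', '2020-03-03'] * 2,
    'Country': ['Italy'] * 3 + ['Spain'] * 3,
    'Confirmed': [10, 25, 45, 1, 4, 4],
    'Deaths': [0, 1, 3, 0, 0, 1],
    'Recovered': [0, 2, 5, 0, 0, 0],
})
result = add_daily_changes(toy)
print(result[['Date', 'Country', 'New cases', 'New deaths', 'New recovered']])
```

On a dataset with gaps in the dates, the explicit lookup of the previous day is the safer choice, since `diff()` only compares adjacent rows.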

The table below shows the basic statistics for the 10 countries under study. The newly added columns appear to be less spread out than the existing ones. Additionally, the average number of new deaths per day is about 204, while new confirmed and new recovered cases average 2713 and 848 per day respectively. The maximum values of the new columns are also quite high for daily figures.

          Confirmed      Recovered        Deaths        Active     New cases   New deaths  New recovered
count  1.113000e+03    1113.000000   1113.000000  1.113000e+03   1113.000000  1113.000000    1113.000000
mean   7.441630e+04   18731.316262   5571.626235  5.011336e+04   2737.325247   205.895777     881.528302
std    1.706108e+05   36570.085483  11689.804752  1.356301e+05   5769.877837   412.379660    1965.492085
min    0.000000e+00       0.000000      0.000000  0.000000e+00      0.000000     0.000000       0.000000
25%    5.000000e+00       0.000000      0.000000  3.000000e+00      0.000000     0.000000       0.000000
50%    3.858000e+03      67.000000     73.000000  3.622000e+03    566.000000    14.000000       1.000000
75%    9.864700e+04   19736.000000   5031.000000  5.268600e+04   3086.000000   191.000000    1238.000000
max    1.347881e+06  232733.000000  80682.000000  1.034466e+06  36188.000000  2612.000000   33227.000000

The figure below illustrates how the daily new cases are spread out over time.

Italy was the first country to show increasing COVID-19 cases, followed by France, Germany and Spain, with the rest of the countries following shortly after. Overall, between the 15th of February and the 15th of March the virus became present in all the countries. In Russia and Brazil the number of daily cases appears to be increasing. In Italy, Spain, Germany and, to a lesser extent, Turkey the daily cases show a decreasing trend. The same holds for France, although on some days the cases increase and then start decreasing again. The fluctuation is significant, considering that on one day 3000 cases were recorded and on the next merely 750 (look between the 29th of April and the 1st of May).

In the UK daily cases have remained steady at high levels since the 5th of April, while in the USA the daily cases increased rapidly in the first days and have remained at significantly high levels ever since. In both the UK and the USA the daily cases do not seem likely to decrease soon.

This steady state also seems to have lasted longer than in the countries that suffered first from the virus (e.g. Italy, Spain, France, Germany).

Tip: By drawing a rectangle with the mouse in the upper graph you can select a period of time; the plot underneath then shows the cumulative new cases for that selection.

# collapse-hide
# plot
interval = alt.selection_interval()

circle = alt.Chart(tmp, title='Spread and New Cases Over Time').transform_filter(
    alt.datum.Country != 'Iran').mark_circle().encode(
    x='monthdate(Date):O',
    y='Country',
    color=alt.condition(interval, 'Country', alt.value('lightgray')),
    size=alt.Size('New cases:Q',
        scale=alt.Scale(range=[0, 3000]),
        legend=alt.Legend(title='Daily new cases')
    ) 
).properties(
    width=1000,
    height=400,
    selection=interval
)

bars = alt.Chart(tmp).mark_bar().encode(
    y='Country',
    color='Country',
    x='sum(New cases):Q'
).properties(
    width=1000
).transform_filter(
    interval
)

circle & bars

The graphs below illustrate the average daily cases across the countries. We can observe that the recovered cases are significantly higher than the deaths, except in the UK, where the opposite holds. As discussed previously, the countries with the most confirmed cases do not necessarily show the most deaths and/or recovered cases.

# collapse-hide
# select the columns with a list; tuple-style column selection on a groupby is no longer supported in recent pandas
group2 = selected_data.groupby('Country')[['New deaths','New cases','New recovered']].mean().sort_values(by=['New deaths','New cases','New recovered'])
group2 = pd.DataFrame(group2.round())
group2 = group2.reset_index()
# all selected countries are kept (they already exceed 100000 confirmed cases)
new_group2 = group2

#define colors
red = alt.value('#f54242')
green = alt.value('#137E2A')
black = alt.value('#050404')

#presenting the confirmed cases per country
bars = alt.Chart(new_group2).mark_bar(size=5).encode(
    x='New cases:Q',
    y=alt.Y("Country:O", sort='-x'),color = red
)

text = bars.mark_text(
    align='left',
    baseline='middle',
    dx=3  # Nudges text to right so it doesn't appear on top of the bar
).encode(
    text='New cases:Q',color =black
)

bars2 = alt.Chart(new_group2).mark_bar(size=5).encode(
    x='New recovered:Q',
    y=alt.Y("Country:O", sort='-x'),color=green
)

text2 = bars2.mark_text(
    align='left',
    baseline='middle',
    dx=3  # Nudges text to right so it doesn't appear on top of the bar
).encode(
    text='New recovered:Q',color=black
)

bars3 = alt.Chart(new_group2).mark_bar(size=5).encode(
    x='New deaths:Q',
    y=alt.Y("Country:O", sort='-x'),color=black
)

text3 = bars3.mark_text(
    align='left',
    baseline='middle',
    dx=3  # Nudges text to right so it doesn't appear on top of the bar
).encode(
    text='New deaths:Q',color=black
)

laydermap = (bars + text).properties(width= 250,height=300)|(bars2+text2).properties(width= 250,height=300)|(bars3+text3).properties(width=250,height=300)
laydermap.configure_axis(grid=False).configure_view(strokeWidth=0)

This discrepancy among the cases across the countries means that the countries have different death, recovery and infection rates. The graph below shows these rates for each country individually, based on the cumulative totals of each case type.

#collapse-hide
#data preprocessing
#death rate
selected_data['DeathRate'] = selected_data['Deaths']/selected_data['Confirmed'] * 100
selected_data = selected_data.fillna(value=0)

#recovery rate
selected_data['RecoveryRate'] = selected_data['Recovered']/selected_data['Confirmed']*100
selected_data = selected_data.fillna(value=0)

#infection rate
population = {'Brazil':212559417,
              'Germany':82002000,
              'Russia':144005000,
              'Turkey':82000000,
             'France':65273511,
             'Italy':60461826,
             'Spain':46754775,
             'US':331002651,
             'United Kingdom':67886011,
             'Iran':83992949}


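# divide each row by the population of its own country; assigning the whole column inside
# a loop would leave every country divided by the population of the last row's country
selected_data['InfectionRate'] = selected_data['Confirmed'] / selected_data['Country'].map(population) * 100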

# A dropdown filter
countries = list(selected_data.Country.unique())
country_dropdown = alt.binding_select(options=countries)
country_select = alt.selection_single(fields=['Country'], bind=country_dropdown, name="Select",init={'Country': 'US'})

#plot  infection rate
filter_infectionrates = alt.Chart(selected_data, width=300, height=300, title='Infection Rate').mark_line().encode(
    alt.X('Date:T'),
    alt.Y('InfectionRate:Q', title= 'Infection Rate %'),
    color='Country',
    tooltip = [alt.Tooltip('InfectionRate:Q')]
).add_selection(country_select).transform_filter(country_select)

# plot death rate
filter_deathrate = alt.Chart(selected_data, width=300, height=300, title='Death Rate').mark_line().encode(
    alt.X('Date:T'),
    alt.Y('DeathRate:Q', title= 'Death Rate %'),
    color='Country',
    tooltip = [alt.Tooltip('DeathRate:Q')]
).add_selection(country_select).transform_filter(country_select)

# plot recovery rate
filter_recovery = alt.Chart(selected_data, width=300, height=300, title='Recovery Rate').mark_line().encode(
    alt.X('Date:T'),
    alt.Y('RecoveryRate:Q', title= 'Recovery Rate %'),
    color='Country',
    tooltip = [alt.Tooltip('RecoveryRate:Q')]
).add_selection(country_select).transform_filter(country_select)
            
(filter_infectionrates | filter_deathrate | filter_recovery) 

Looking first at the infection rates, an increasing trend has been recorded for all the countries since the first day the virus appeared. More precisely, as of the day the report was written, all the countries apart from the USA record infection rates in the range 0.16%-0.30%. These are relatively low percentages compared to their populations, which range from about 47 million (Spain) to about 212 million (Brazil). In the USA, on the other hand, the infection rate has risen to 2%, significantly higher than in the rest of the countries.

Moving on to death rates, an increasing trend is still observed, but with more fluctuations. For example, in the case of Iran there is a spike around the 20th-23rd of February, where a death rate of 100% is recorded. This is quite unusual, but the confirmed cases at that time were very few, so this spike might correspond to 1 or 2 deaths out of 1 or 2 cases. Similar anomalies are observed for France, where around the same period a sharp increase in death rate is followed by a rapid decrease, until the rate starts to increase again at a steadier pace. The same holds for the USA. The highest death rates are recorded in France, Italy and Spain, with 15%, 14% and 12% respectively.

Finally, the recovery rates vary a lot among the countries. The highest recovery rate, 82%, is recorded in Germany, while the lowest, almost 0%, is recorded in the United Kingdom. As with the death rates, a few anomalies are observed in the first days of the virus, where some countries record recovery rates of 100%. In Russia this lasted for almost a month, during which a 0% death rate was recorded. This does not seem very plausible, since after the 9th of March the results followed a reversed course for Russia. It could be that some diagnoses were falsely recorded as COVID-19 recoveries.

To close this section, we summarise the outcome of the analysis by illustrating the relationship among the case types for each country throughout the period of COVID-19 impact. The graph below shows confirmed cases against deaths per day since 22/01/2020, while the magnitude of the recovered cases is shown by the size of the bubbles.

Tip: Drag the slider below the bubble plot to see how the situation changes over time.
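
The early spikes come from very small denominators: one or two deaths out of one or two confirmed cases already give a rate of 50% or 100%. The sketch below illustrates this and shows one optional way to hide such points. It is not part of the original analysis; the helper `death_rate`, the threshold `min_confirmed=100` and the toy numbers are assumptions chosen for illustration.

```python
import pandas as pd

def death_rate(df, min_confirmed=100):
    """Death rate in %, left as NaN where confirmed cases are below min_confirmed."""
    rate = df['Deaths'] / df['Confirmed'] * 100
    # where() keeps the rate only on rows with enough confirmed cases
    return rate.where(df['Confirmed'] >= min_confirmed)

# Illustrative cumulative counts (assumption): two early days, two later days
toy = pd.DataFrame({'Confirmed': [2, 5, 200, 1000], 'Deaths': [2, 2, 10, 80]})
toy['RawRate'] = toy['Deaths'] / toy['Confirmed'] * 100
toy['MaskedRate'] = death_rate(toy)
print(toy)
```

In the toy data the raw rate starts at 100% with only two cases, while the masked rate stays empty until the case count is large enough for the percentage to be meaningful.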

# collapse-hide
# data processing
start_date = datetime.strptime('2020-01-22', '%Y-%m-%d')

# days elapsed since start_date, computed for the whole column at once
selected_data['Day'] = (pd.to_datetime(selected_data['Date']) - start_date).dt.days
# plot
select_date = alt.selection_single(
    name='select', fields=['Day'], init={'Day': 0},
    bind=alt.binding_range(min=0, max=selected_data.Day.max(), step=1)
)
alt.Chart(selected_data, title='COVID-19 Spread Over Time').mark_point(filled=True).encode(
    alt.X('Confirmed', scale=alt.Scale(zero=False)),
    alt.Y('Deaths', scale=alt.Scale(zero=False)),
    alt.Size('Recovered',scale=alt.Scale(zero=False)),
    alt.Color('Country'),
    alt.Order('Confirmed', sort='descending'),
    tooltip = [alt.Tooltip('Country'),
               alt.Tooltip('Confirmed'),
               alt.Tooltip('Deaths'),
               alt.Tooltip('Recovered')
              ],
).properties(
    width=700,
    height=400
).add_selection(select_date).transform_filter(select_date)

By moving the slider at the bottom of the graph one can observe the overall impact of COVID-19 per day, as well as how the virus has impacted the countries throughout the entire period, which spans 111 days so far (as of the day the report was written).

Note: The UK is hard to see due to its very low number of recovered cases.

We observe that deaths and recovered cases have been increasing as the confirmed cases increase, although the rates differ among the countries, as shown in the graphs above. Around day 110 the countries fall into three clusters. The first is the USA on its own, which has the highest numbers in all case types. The second cluster consists of Spain, Italy and France, due to their higher number of deaths (approximately 30000 on average) compared to the rest of the countries (third cluster), which record approximately 8000 - 10000 deaths.

Overall, we saw how COVID-19 has spread around the globe and which countries have been affected the most. We noticed that 10 countries have been hit hardest by the virus. We reached this conclusion by comparing the total number of cases in those countries with the rest of the world, and the differences were significant. To be more precise, we found that these 10 countries account for about 70% of all confirmed cases and 80% of all deaths. There is no single explanation of why such developed countries were unable to prevent the spread of the virus while smaller ones managed to do so (e.g. Denmark, Greece). A major reason could be the hesitation of their leaders to implement strict measures to isolate the virus.

Nevertheless, moving forward we would like to investigate the economic consequences of the virus for the countries under study. As mentioned before, 5 of those countries are among the most advanced economies of the world. Thus, a potential hit to their economies due to COVID-19 could have a significant impact on other countries' economies as well.

Economic Analysis

In this section we attempt an economic analysis from a macroeconomic point of view and, in relation to the COVID-19 analysis above, try to draw potential conclusions on how the spread of the virus has affected the global economy.

Note: Data about the economic situation in Iran weren't available, so Iran is not included in this part of the study.

We have divided the analysis into two parts. In the first we look at the stock market of each country and analyse its major indices as well as the performance of some of its primary sectors. The second part takes a more macroeconomic approach, focusing on the countries' GDP, unemployment rates and inflation.

Stock Market

The graph below illustrates the stock prices of the major indices and primary sectors. It shows how the prices developed from 01/04/2019 to 30/04/2020.

Note: The stock data don't come from an open API, so it is difficult to update them daily. We therefore decided to show a one-year period.

A short reminder: the virus started spreading in the focus countries around 22/01/2020 and reached its peak between March and April in most of them.
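
The stock preprocessing applies the same steps to every country: label each series with a symbol, stack the series, parse the dates, and convert comma-formatted price strings to floats. As a minimal sketch of that pattern, the steps could be wrapped in one helper. The function name `build_stock_frame` and the toy prices are hypothetical and are not part of the notebook.

```python
import pandas as pd

def build_stock_frame(frames_by_symbol):
    """Combine per-index frames into one frame with Symbol, parsed Date and float Price."""
    # assign() returns a labelled copy, so the input frames are left untouched
    labelled = [df.assign(Symbol=symbol) for symbol, df in frames_by_symbol.items()]
    stock = pd.concat(labelled, sort=True)
    stock['Date'] = pd.to_datetime(stock['Date'])
    # prices are assumed to arrive as strings with thousands separators, e.g. '5,405.53'
    stock['Price'] = (stock['Price'].astype(str)
                      .str.replace(',', '', regex=False)
                      .astype(float))
    return stock.sort_values(by=['Symbol', 'Date']).reset_index(drop=True)

# Illustrative input (assumption), shaped like the raw per-index tables
cac = pd.DataFrame({'Date': ['2019-04-02', '2019-04-01'],
                    'Price': ['5,423.47', '5,405.53']})
tech = pd.DataFrame({'Date': ['2019-04-01'], 'Price': ['1,000.00']})
stock = build_stock_frame({'CAC 40': cac, 'CAC Technology': tech})
print(stock)
```

With such a helper, each country would need only a dictionary that maps symbol names to the loaded frames.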

#collapse-hide
# preprocessing data
# France
stockCAC40['Symbol']='CAC 40'
CACbasic['Symbol'] = 'CAC Basic Materials'
CACconsumer['Symbol'] = 'CAC Consumer Goods'
CACservice['Symbol'] = 'CAC Consumer Services'
CACtech['Symbol'] = 'CAC Technology'
CAChealth['Symbol'] = 'CAC Health Care'
cacall['Symbol'] = 'France All Shares'
stockFRA = pd.concat([stockCAC40,CACbasic,CACconsumer,CACservice,CACtech,
                     CAChealth,cacall],sort = True)
stockFRA['Date'] = pd.to_datetime(stockFRA.Date)
stockFRA = stockFRA.sort_values(by=['Symbol','Date'])
stockFRA['Price'] = stockFRA['Price'].str.replace(',','')
stockFRA['Price'] = stockFRA['Price'].astype(float)

# Italy
stockMIB['Symbol']='MIB'
utilities['Symbol'] = 'FTSE Utilities'
Technology['Symbol'] = 'FTSE Technology'
O_G['Symbol'] = 'FTSE Oil & Gas'
Travel['Symbol'] = 'FTSE Travel & Leisure'
industrials['Symbol'] = 'FTSE Industrials'
financials['Symbol'] = 'FTSE Financials'
health['Symbol'] = 'FTSE Health Care'
chemicals['Symbol'] = 'FTSE Chemicals'
allsharesitalia['Symbol'] = 'Italy All Shares'
stockITA = pd.concat([stockMIB,utilities,Technology,O_G,Travel,
                     industrials,financials,health,chemicals,allsharesitalia],sort = True)
stockITA['Date'] = pd.to_datetime(stockITA.Date)
stockITA = stockITA.sort_values(by=['Symbol','Date'])
stockITA['Price'] = stockITA['Price'].str.replace(',','')
stockITA['Price'] = stockITA['Price'].astype(float)

# Spain
ibex['Symbol']='IBEX 35'
materialssp['Symbol'] = 'Basic Materials Industry and Construction'
consumersp['Symbol'] = 'Consumer Goods'
servicesp['Symbol'] = 'Services'
petrolsp['Symbol'] = 'Petrol and Power'
spainall['Symbol'] = 'Spain All Shares'
stockSP = pd.concat([ibex,materialssp,consumersp,servicesp,petrolsp,spainall],sort = True)
stockSP['Date'] = pd.to_datetime(stockSP.Date)
stockSP = stockSP.sort_values(by=['Symbol','Date'])
stockSP['Price'] = stockSP['Price'].str.replace(',','')
stockSP['Price'] = stockSP['Price'].astype(float)

# UK
ftse100['Symbol']='FTSE 100'
auto['Symbol'] = 'Automobiles & Parts'
forestry['Symbol'] = 'Forestry & Paper'
metals['Symbol'] = 'Industrial Metals & Mining'
telecom['Symbol'] = 'Mobile Telecommunications'
realestate['Symbol'] = 'Real Estate'
beverage['Symbol'] = 'Beverages'
ukall['Symbol'] = 'United Kingdom All Shares'
chemicalsuk['Symbol'] = 'Chemicals'
construction['Symbol'] = 'Construction & Building Materials'
stockUK = pd.concat([ftse100,auto,forestry,metals,telecom,realestate,beverage,chemicalsuk,construction,ukall],sort = True)
stockUK['Date'] = pd.to_datetime(stockUK.Date)
stockUK = stockUK.sort_values(by=['Symbol','Date'])
stockUK['Price'] = stockUK['Price'].str.replace(',','')
stockUK['Price'] = stockUK['Price'].astype(float)

# Turkey
bist['Symbol']='BIST 100'
basictu['Symbol'] = 'Metals & Mining'
chemtu['Symbol'] = 'Chem Petrol Plastic'
electu['Symbol'] = 'Electricity'
foodtu['Symbol'] = 'Food & Beverages'
industrialstu['Symbol'] = 'Industrials'
financialstu['Symbol'] = 'Financial'
ittu['Symbol'] = 'Information Technology'
tourtu['Symbol'] = 'Tourism'
stockTU = pd.concat([bist,basictu,chemtu,electu,foodtu,financialstu,industrialstu,ittu,
                    tourtu],sort = True)
stockTU['Date'] = pd.to_datetime(stockTU.Date)
stockTU = stockTU.sort_values(by=['Symbol','Date'])
stockTU['Price'] = stockTU['Price'].str.replace(',','')
stockTU['Price'] = stockTU['Price'].astype(float)

# USA
dow30['Symbol']='Dow 30'
SP['Symbol'] ='S&P 500'
nasdaq['Symbol'] ='NASDAQ'
banksus['Symbol'] = 'Banks'
financialsus['Symbol'] = 'Financials'
industrialsus['Symbol'] = 'Industrials'
insuranceus['Symbol'] = 'Insurance'
computersus['Symbol'] = 'Computers'
telecomus['Symbol'] = 'Telecommunications'
transportationus['Symbol'] = 'Transportation'

stockUS = pd.concat([dow30,SP, nasdaq,banksus,financialsus,industrialsus,
                     insuranceus,computersus,
                    telecomus,transportationus],sort = True)
stockUS['Date'] = pd.to_datetime(stockUS.Date)
stockUS = stockUS.sort_values(by=['Symbol','Date'])
stockUS['Price'] = stockUS['Price'].str.replace(',','')
stockUS['Price'] = stockUS['Price'].astype(float)

# Germany
dax['Symbol']='DAX'
autogr['Symbol'] = 'Automobile'
chemicalsgr['Symbol'] = 'Chemicals'
constructiongr['Symbol'] = 'Construction'
banksgr['Symbol'] = 'Banks'
consumergr['Symbol'] = 'Consumer'
financialsgr['Symbol'] = 'Financial'
foodgr['Symbol'] = 'Food & Beverages'
industrialgr['Symbol'] = 'Industrial'
stockGR = pd.concat([dax,autogr,chemicalsgr,constructiongr,banksgr,consumergr,financialsgr,
                    foodgr,industrialgr],sort = True)
stockGR['Date'] = pd.to_datetime(stockGR.Date)
stockGR = stockGR.sort_values(by=['Symbol','Date'])
stockGR['Price'] = stockGR['Price'].str.replace(',','')
stockGR['Price'] = stockGR['Price'].astype(float)

# Russia
moex['Symbol']='MOEX'
miningru['Symbol'] = 'Metals & Mining'
chemicalsru['Symbol'] = 'Chemicals'
electricityru['Symbol'] = 'Electricity'
oilru['Symbol'] = 'Oil & Gas'
transportru['Symbol'] = 'Transport'
consumerru['Symbol'] = 'Consumer'
financialsru['Symbol'] = 'Financial'
teleru['Symbol'] = 'Telecommunication'
stockRU = pd.concat([moex,miningru,chemicalsru,electricityru,oilru,transportru,
                    consumerru,financialsru,teleru],sort = True)
stockRU['Date'] = pd.to_datetime(stockRU.Date)
stockRU = stockRU.sort_values(by=['Symbol','Date'])
stockRU['Price'] = stockRU['Price'].str.replace(',','')
stockRU['Price'] = stockRU['Price'].astype(float)

# Brazil
bovespa['Symbol']='Bovespa'
basicbr['Symbol'] = 'Basic Materials'
electricalbr['Symbol'] = 'Electricity'
financialbr['Symbol'] = 'Industrial'
industrialbr['Symbol'] = 'Gas & Water'
consumptionbr['Symbol'] = 'Consumption'
healthbr['Symbol'] = 'Health Care'
realestatebr['Symbol'] = 'Real Estate Investment & Services'
stockBR = pd.concat([bovespa,basicbr,electricalbr,industrialbr,consumptionbr,financialbr,healthbr,
                    realestatebr],sort = True)
stockBR['Date'] = pd.to_datetime(stockBR.Date)
stockBR = stockBR.sort_values(by=['Symbol','Date'])
stockBR['Price'] = stockBR['Price'].str.replace(',','')
stockBR['Price'] = stockBR['Price'].astype(float)

# add country column
stockFRA['Country']='France'
stockITA['Country']='Italy'
stockSP['Country']='Spain'
stockUK['Country']='UK'
stockUS['Country']='United States'
stockBR['Country']='Brazil'
stockGR['Country']='Germany'
stockRU['Country']='Russia'
stockTU['Country']='Turkey'
stocks = pd.concat([stockFRA,stockITA,stockSP,stockUK,stockUS,
                   stockBR,stockGR,stockRU,stockTU],sort = True)

#dropdown
countries = list(stocks.Country.unique())
country_dropdown = alt.binding_select(options=countries)
country_select = alt.selection_single(fields=['Country'], bind=country_dropdown, name="Select", init={'Country': 'United States'})


line = alt.Chart(stocks, title='Major Index & Primary Sectors Stocks Price (Major Countries)').mark_line(interpolate='basis',size=5).encode(
    x = 'Date',
    y = 'Price',
    color='Symbol',
    strokeDash='Symbol',
    tooltip = [alt.Tooltip('Symbol:N'),
               alt.Tooltip('Price:Q')]
).properties(width=700, height=500).add_selection(country_select).transform_filter(country_select)

line
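
The per-country preprocessing above repeats the same steps: label each table with a symbol, stack the tables, parse the dates and strip the thousands separator from the prices. As a minimal sketch, assuming every input table has a `Date` column and a string `Price` column as in the cell above, those steps could be folded into one helper. The name `build_country_frame` and the demo tables are hypothetical and only illustrate the call shape; they are not part of the original notebook.

```python
import pandas as pd


def build_country_frame(frames_by_symbol, country):
    """Label, stack and clean the per-index price tables of one country."""
    # Attach the display name of each index/sector to its own table.
    labelled = [df.assign(Symbol=symbol) for symbol, df in frames_by_symbol.items()]
    out = pd.concat(labelled, sort=True)
    out["Date"] = pd.to_datetime(out["Date"])
    # Prices arrive as strings such as "5,400.10": drop the thousands separator.
    out["Price"] = (
        out["Price"].astype(str).str.replace(",", "", regex=False).astype(float)
    )
    out["Country"] = country
    return out.sort_values(["Symbol", "Date"]).reset_index(drop=True)


# Tiny made-up inputs, only to show how the helper is called (not market data).
demo_index = pd.DataFrame(
    {"Date": ["2020-01-03", "2020-01-02"], "Price": ["1,010.5", "1,000.0"]}
)
demo_sector = pd.DataFrame({"Date": ["2020-01-02"], "Price": ["250.0"]})
demo = build_country_frame(
    {"Demo Index": demo_index, "Demo Sector": demo_sector}, "Demoland"
)
print(demo)
```

With such a helper, each country block would reduce to one call with a dictionary that maps symbol names to the loaded tables.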

A common pattern across countries is that almost all indices and primary sectors record a drop in price.

In France we observe a significant drop in the CAC 40 (the major index) from 6000 to 4000 units, followed by a slow increase of 2000 units from mid-March until the end of April. The French economy overall therefore seems to be starting to return to normal after the strong hit at the beginning of March.

Italy was one of the first countries to be hit by COVID-19. The sectors affected the most were health care and technology, although both had returned to pre-COVID-19 levels by the end of April. The MIB (Italy's major index) records a drop of 5000 units without any significant increase thereafter.

In Spain we observe a steep drop in the major index (IBEX 35), which tracks the 35 strongest Spanish companies. After the drop the index price stays roughly flat, which suggests that the major companies in Spain are not overcoming the crisis quickly. Among the remaining sectors, consumer goods was impacted the most, while the others dropped on a smaller scale. Unfortunately, not enough data could be found for other important Spanish sectors such as technology and financials. Judging by the "Spain All Shares" index, the Spanish economy as a whole does not appear to have been affected to a significant degree.

Although the UK was hit hard by COVID-19, its prices did not drop as significantly as in Italy and France, for example. Sectors such as beverages and forestry seem to have been affected the most, probably because exports decreased while the hospitality sector was inactive under the lockdown from the beginning of March.

In the USA, the country with the most cases and deaths, the economic impact was hard judging by the Dow 30, which fell by 10000 units. Although it has been rising since mid-March, it is still significantly below pre-COVID-19 levels. The S&P 500 and NASDAQ, on the other hand, show much smaller decreases of 1000 and 2000 units respectively, and seem to have already recovered their losses by the end of April. The same applies to the rest of the primary sectors.

Brazil records the largest drop of any major index: the Bovespa fell by 47000 units. Because a major index reflects the stock prices of a country's largest companies, this indicates that many of the powerful Brazilian companies have been hit by the COVID-19 crisis. Companies in the electricity sector seem to be among them, with a drop of approximately 30000 units, and interestingly they are recovering at a very slow pace. The rest of the sectors included in the study record small decreases.

Germany took a strong hit in its industrial sector in general, which shows a drop of approximately 3000 units, while the major German index (DAX) has dropped by approximately 6000 units since the beginning of the crisis.

In Russia, prices in the chemicals sector appear to have increased during the COVID-19 period. The financial and oil & gas sectors, on the other hand, were affected the most.

Lastly, in Turkey, sectors such as food and beverages and chemicals and petrol saw large drops of 80000 and 50000 units respectively and then rose again to almost pre-COVID-19 levels. Apart from tourism, which seems quite steady throughout the year, the rest of the sectors record medium to high fluctuations during the COVID-19 period (March-April 2020).

The stock analysis shows that in all focus countries the major indices and a few of the primary sectors record price drops. Spain, France, Brazil, Italy and Germany seem to be stuck at a steady low level and, based on what has been recorded so far, do not seem likely to overcome the situation any time soon. The UK, Russia and the USA, on the other hand, seem to be moving towards pre-COVID-19 levels at a faster pace, which suggests that the major companies in these countries are adapting to the post-COVID-19 era and overcoming the initial shock caused by the virus.
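
The drops above are quoted in index units, which are hard to compare between indices that trade at very different levels. A possible complement, sketched here under our own assumptions, is to express each drop as a percentage of the pre-crisis peak and to measure how much of it has been recovered by the last observation. The function name, the `crisis_start` cut-off and the demo series are hypothetical and are not taken from the notebook.

```python
import pandas as pd


def drawdown_and_recovery(prices, crisis_start):
    """Return (peak-to-trough drop in %, share of that drop recovered in %).

    `prices` is a Series indexed by date; `crisis_start` separates the
    pre-crisis period (where the peak is taken) from the crisis period.
    """
    prices = prices.sort_index()
    peak = prices[prices.index < crisis_start].max()
    trough = prices[prices.index >= crisis_start].min()
    last = prices.iloc[-1]
    drop_pct = (trough - peak) / peak * 100
    if peak > trough:
        recovered_pct = (last - trough) / (peak - trough) * 100
    else:
        recovered_pct = float("nan")  # no drop, so nothing to recover
    return drop_pct, recovered_pct


# Made-up prices, only to illustrate the calculation (not market data).
demo_prices = pd.Series(
    [100, 120, 60, 90],
    index=pd.to_datetime(["2020-01-02", "2020-02-03", "2020-03-16", "2020-04-30"]),
)
drop, recovered = drawdown_and_recovery(demo_prices, pd.Timestamp("2020-02-20"))
print(f"drop: {drop:.1f}%, recovered: {recovered:.1f}%")
```

Applied to each (Country, Symbol) group of `stocks`, a measure like this would put an index quoted in the tens of thousands and one quoted in the thousands on the same scale.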

Macroeconomic Analysis

In the last part of the analysis we look at the economic impact of the virus from a macroeconomic perspective. Macroeconomics is the branch of economics that studies how an overall economy behaves, focusing on the large scale. More precisely, macroeconomics studies economy-wide phenomena such as inflation, price levels, rate of economic growth, national income, gross domestic product (GDP), and changes in unemployment (Investopedia).

The graph below shows the annual GDP growth, inflation and unemployment rates of the major countries, based on IMF data including forecasts for 2020 and 2021.

# collapse-hide
# data preprocessing
def extract_data(df, subject):
    """Reshape one IMF indicator from wide (one column per year) to long format."""
    dates = ['2014', '2015', '2016', '2017', '2018', '2019', '2020', '2021']
    values = []
    countries = []
    _dates = []
    for country in df.Country.unique():
        tmp = df.loc[df.Country == country]
        for date in dates:
            countries.append(country)
            _dates.append(date)
            # each country is expected to have a single row per indicator
            values.append(float(tmp[date].iloc[0]))

    rv = pd.DataFrame.from_dict({'Date': _dates, 'Country': countries, 'Value': values})
    rv['subject'] = subject
    return rv

unemploy = majorCountry.loc[majorCountry['Subject Descriptor'] == 'Unemployment rate']
unemploy = extract_data(unemploy, 'unemployment')
inflation = majorCountry.loc[majorCountry['Subject Descriptor'] == 'Inflation, average consumer prices']
inflation = extract_data(inflation, 'inflation')
gdp = majorCountry.loc[majorCountry['Subject Descriptor'] == 'Gross domestic product, constant prices']
gdp = extract_data(gdp, 'gdp')

# A dropdown filter
countries = list(majorCountry.Country.unique())
country_dropdown = alt.binding_select(options=countries)
country_select = alt.selection_single(fields=['Country'], bind=country_dropdown, name="Select",init={'Country': 'United States'})

filter_gdp = alt.Chart(gdp, width=300, height=300, title='GDP Growth of Major Countries').mark_line(point=True).encode(
    alt.X('Date:T'),
    alt.Y('Value:Q', title= 'Growth Rate %'),
    color='Country',
    tooltip = [alt.Tooltip('Value:Q')]
).add_selection(country_select).transform_filter(country_select)

# unemployment plot
filter_unemployment = alt.Chart(unemploy, width=300, height=300, title='Unemployment Change of Major Countries').mark_line(point=True).encode(
    alt.X('Date:T'),
    alt.Y('Value:Q', title= 'Unemployment Rate %'),
    color='Country',
    tooltip = [alt.Tooltip('Value:Q')]
).add_selection(country_select).transform_filter(country_select)

# inflation plot
filter_inflation = alt.Chart(inflation, width=300, height=300, title='Inflation Change of Major Countries').mark_line(point=True).encode(
    alt.X('Date:T'),
    alt.Y('Value:Q', title= 'Inflation Rate %'),
    color='Country',
    tooltip = [alt.Tooltip('Value:Q')]
).add_selection(country_select).transform_filter(country_select)


(filter_gdp | filter_unemployment | filter_inflation)
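
The wide-to-long reshaping done by `extract_data` can also be sketched with `pandas.DataFrame.melt`. This is an alternative we have not run against the IMF file: it assumes one row per country and one column per year, and it turns non-numeric entries into missing values rather than raising an error. The name `extract_data_melt` and the demo table are hypothetical.

```python
import pandas as pd

YEARS = ["2014", "2015", "2016", "2017", "2018", "2019", "2020", "2021"]


def extract_data_melt(df, subject, years=YEARS):
    """Turn one indicator table (one column per year) into long format."""
    long = df.melt(
        id_vars="Country", value_vars=years, var_name="Date", value_name="Value"
    )
    # Assumption: entries that are not numbers should become NaN.
    long["Value"] = pd.to_numeric(long["Value"], errors="coerce")
    long["subject"] = subject
    return long.sort_values(["Country", "Date"]).reset_index(drop=True)


# Made-up numbers, only to show the shape of the result (not IMF data).
demo_wide = pd.DataFrame(
    {"Country": ["A", "B"], "2019": ["1.5", "2.0"], "2020": ["-3.0", "-4.5"]}
)
demo_long = extract_data_melt(demo_wide, "gdp", years=["2019", "2020"])
print(demo_long)
```

The result has one row per country and year, which is the layout the Altair charts above expect.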

The IMF's growth projections look similar for all the countries: the COVID-19 crisis appears to have significantly affected the overall economy of every country under study. This matches the initial impression of each country's economic situation that we obtained from the stock market analysis. Brazil, for example, which shows the largest drop in its major stock index, is projected to fall from 1% growth in 2019 to -5% in 2020, although the signs for 2021 seem positive. The major EU countries (Spain, Italy, France and Germany) are projected to see a substantial drop as well, reaching growth of -8% on average. The USA, UK and Russia are projected to record growth of -6%, and Turkey -4%.

Consequently, unemployment rates are expected to increase. For Spain the rate is projected to reach as much as 20%, while for Italy and France it will reach 10%-11%. In Germany the unemployment rate will increase to 4%, which is still very low. The same applies to the UK and Russia, where unemployment is expected to reach 5% in both. In the USA unemployment is projected to skyrocket from 4% in 2019 to 10% in 2020. In Brazil and Turkey unemployment has been rising continuously over the last five years and is projected to climb to 14% and 17% respectively.

Finally, inflation is projected to drop in 2020 for all the countries apart from Brazil, where it will remain steady. Low inflation means that the prices of goods and services rise more slowly, whereas deflation means that they actually fall. Given the overall impact of COVID-19, the economies appear to be drifting towards deflation, because many goods and services remain available while too little money is circulating to buy them. Prices then fall, and companies are forced to lay off staff. According to Investopedia, central banks aim to keep inflation between 2% and 3% per year. According to the IMF's projections, all the countries are within this range for 2020 and 2021 (if nothing changes). We would therefore conclude that a general financial crisis can be avoided.
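
To make the distinction between deflation, low inflation and the target band concrete, here is a minimal sketch that places an annual inflation rate relative to the 2%-3% range quoted above. The function name and the example rates are our own and are not IMF figures.

```python
def classify_inflation(rate_pct, target_low=2.0, target_high=3.0):
    """Place an annual inflation rate (in %) relative to a target band."""
    if rate_pct < 0:
        return "deflation"  # the price level is falling
    if rate_pct < target_low:
        return "below target"  # prices still rise, but slowly
    if rate_pct <= target_high:
        return "within target"
    return "above target"


# Made-up rates, only to exercise each branch (not IMF projections).
for rate in (-0.5, 0.8, 2.4, 4.1):
    print(f"{rate:+.1f}% -> {classify_inflation(rate)}")
```

Applied to the `Value` column of the `inflation` table, a check like this would show for each country and year which side of the band the projection falls on.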

International trading

International trade between countries is one of the most important economic activities. In this section we analyse whether COVID-19 has had an impact on trading activity.

By the end of 2019, the major countries that are currently suffering from the virus had not yet been affected. Compared to the previous quarters of 2019, import and export volumes therefore did not drop much. Some countries, such as Brazil and Russia, even saw an increase in imports.

However, by the end of the first quarter of 2020 most major countries had taken measures such as travel restrictions and lockdowns, which have had a huge impact on international trade.

Note: We collected the data from the OECD. For some countries the dataset includes the first quarter of 2020, for others it does not.

A large decrease in both imports and exports of the US can be observed in the graph below.

#collapse-hide
# data preprocessing
trade = trade.replace({'Imports in goods (value)': 'Imports', 'Exports in goods (value)': 'Exports'})
# trade data doesn't include Spain
countries = ['Italy', 'United States', 'France', 'Germany', 'Turkey', 'United Kingdom', 'Russia', 'Brazil']
trade = trade.loc[trade.Country.isin(countries)]
quarterlyTrade = trade.loc[trade.Frequency == 'Quarterly']
monthlyTrade = trade.loc[trade.Frequency == 'Monthly']

# a dropdown
country_dropdown = alt.binding_select(options=countries)
realPercent = alt.binding_radio(options=['Percentage', 'US Dollar'])
country_select = alt.selection_single(name="Select",
                                      fields=['Country', 'Unit'], 
                                      bind={'Country': country_dropdown, 'Unit': realPercent},  
                                      init={'Country': 'United States', 'Unit': 'Percentage'})

alt.Chart(quarterlyTrade).mark_bar().encode(
    x='Subject:O',
    y=alt.Y('Value:Q', title='Change Percentage(%) or Billion'),
    color=alt.condition(
        alt.datum.Value > 0,
        alt.value("steelblue"),  # The positive color
        alt.value("orange")  # The negative color
    ),
    tooltip = [alt.Tooltip('Value:Q')],
    column=alt.Column('TIME:N', title='Date')
).add_selection(country_select).transform_filter(country_select)
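
The chart can switch between absolute values and percentage changes. As a rough sketch of how such a percentage could be derived from the absolute values, assuming it means the change versus the previous quarter (the source does not state how the OECD defines it), `Series.pct_change` can be used. The function name and the figures below are made up for illustration.

```python
import pandas as pd


def quarterly_change(values):
    """Percentage change of each quarter versus the previous one."""
    return (values.pct_change() * 100).round(2)


# Made-up quarterly trade values in billions, only for illustration.
demo_exports = pd.Series([200.0, 210.0, 189.0], index=["2019-Q3", "2019-Q4", "2020-Q1"])
demo_change = quarterly_change(demo_exports)
print(demo_change)
```

The first quarter has no predecessor, so its change is missing by construction.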

Genres

Visualization

Discussion

Contribution

References

  1. Investopedia